Goto

Collaborating Authors

 multiplicative weight update


Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos

Neural Information Processing Systems

The Multiplicative Weights Update (MWU) method is a ubiquitous meta-algorithm that works as follows: A distribution is maintained on a certain set, and at each step the probability assigned to action $\gamma$ is multiplied by $(1 -\epsilon C(\gamma))> 0$ where $C(\gamma)$ is the ``cost of action $\gamma$ and then rescaled to ensure that the new values form a distribution. We analyze MWU in congestion games where agents use \textit{arbitrary admissible constants} as learning rates $\epsilon$ and prove convergence to \textit{exact Nash equilibria}. Interestingly, this convergence result does not carry over to the nearly homologous MWU variant where at each step the probability assigned to action $\gamma$ is multiplied by $(1 -\epsilon)^{C(\gamma)}$ even for the simplest case of two-agent, two-strategy load balancing games, where such dynamics can provably lead to limit cycles or even chaotic behavior.



Learning compositional functions via multiplicative weight updates

Neural Information Processing Systems

Compositionality is a basic structural feature of both biological and artificial neural networks. Learning compositional functions via gradient descent incurs well known problems like vanishing and exploding gradients, making careful learning rate tuning essential for real-world applications. This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions. Based on this lemma, we derive Madam---a multiplicative version of the Adam optimiser---and show that it can train state of the art neural network architectures without learning rate tuning. We further show that Madam is easily adapted to train natively compressed neural networks by representing their weights in a logarithmic number system. We conclude by drawing connections between multiplicative weight updates and recent findings about synapses in biology.


Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos

Neural Information Processing Systems

The Multiplicative Weights Update (MWU) method is a ubiquitous meta-algorithm that works as follows: A distribution is maintained on a certain set, and at each step the probability assigned to action $\gamma$ is multiplied by $(1 -\epsilon C(\gamma))> 0$ where $C(\gamma)$ is the ``cost of action $\gamma$ and then rescaled to ensure that the new values form a distribution. We analyze MWU in congestion games where agents use \textit{arbitrary admissible constants} as learning rates $\epsilon$ and prove convergence to \textit{exact Nash equilibria}. Interestingly, this convergence result does not carry over to the nearly homologous MWU variant where at each step the probability assigned to action $\gamma$ is multiplied by $(1 -\epsilon)^{C(\gamma)}$ even for the simplest case of two-agent, two-strategy load balancing games, where such dynamics can provably lead to limit cycles or even chaotic behavior.




Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games

Soleymani, Ashkan, Piliouras, Georgios, Farina, Gabriele

arXiv.org Artificial Intelligence

We introduce Cautious Optimism, a framework for substantially faster regularized learning in general games. Cautious Optimism, as a variant of Optimism, adaptively controls the learning pace in a dynamic, non-monotone manner to accelerate no-regret learning dynamics. Cautious Optimism takes as input any instance of Follow-the-Regularized-Leader (FTRL) and outputs an accelerated no-regret learning algorithm (COFTRL) by pacing the underlying FTRL with minimal computational overhead. Importantly, it retains uncoupledness, that is, learners do not need to know other players' utilities. Cautious Optimistic FTRL (COFTRL) achieves near-optimal $O_T(\log T)$ regret in diverse self-play (mixing and matching regularizers) while preserving the optimal $O_T(\sqrt{T})$ regret in adversarial scenarios. In contrast to prior works (e.g., Syrgkanis et al. [2015], Daskalakis et al. [2021]), our analysis does not rely on monotonic step sizes, showcasing a novel route for fast learning in general games. Moreover, instances of COFTRL achieve new state-of-the-art regret minimization guarantees in general convex games, exponentially improving the dependence on the dimension of the action space $d$ over previous works [Farina et al., 2022a].



Log-Normal Multiplicative Dynamics for Stable Low-Precision Training of Large Networks

Nishida, Keigo, Kıral, Eren Mehmet, Bannai, Kenichi, Khan, Mohammad Emtiyaz, Möllenhoff, Thomas

arXiv.org Machine Learning

Studies in neuroscience have shown that biological synapses follow a log-normal distribution whose transitioning can be explained by noisy multiplicative dynamics. Biological networks can function stably even under dynamically fluctuating conditions arising due to unreliable synaptic transmissions. Here we ask: Is it possible to design similar multiplicative training in artificial neural networks? To answer this question, we derive a Bayesian learning rule that assumes log-normal posterior distributions over weights which gives rise to a new Log-Normal Multiplicative Dynamics (LMD) algorithm. The algorithm uses multiplicative updates with both noise and regularization applied multiplicatively. The method is as easy to implement as Adam and only requires one additional vector to store. Our results show that LMD achieves stable and accurate training-from-scratch under low-precision forward operations for Vision Transformer and GPT-2. These results suggest that multiplicative dynamics, a biological feature, may enable stable low-precision inference and learning on future energy-efficient hardware.


Langevin Multiplicative Weights Update with Applications in Polynomial Portfolio Management

Feng, Yi, Wang, Xiao, Xie, Tian

arXiv.org Artificial Intelligence

We consider nonconvex optimization problem over simplex, and more generally, a product of simplices. We provide an algorithm, Langevin Multiplicative Weights Update (LMWU) for solving global optimization problems by adding a noise scaling with the non-Euclidean geometry in the simplex. Non-convex optimization has been extensively studied by machine learning community due to its application in various scenarios such as neural network approximation and finding Nash equilibrium. Despite recent progresses on provable guarantee of escaping and avoiding saddle point (convergence to local minima) and global convergence of Langevin gradient based method without constraints, the global optimization with constraints is less studied. We show that LMWU algorithm is provably convergent to interior global minima with a non-asymptotic convergence analysis. We verify the efficiency of the proposed algorithm in real data set from polynomial portfolio management, where optimization of a highly non-linear objective function plays a crucial role.